This project will let you discover in detail a UNIX mechanism (pipes) that you already know by using it in your program.
Your program will be executed as follows:
./pipex file1 cmd1 cmd2 file2
It must take 4 arguments
file1 & file2 are file names cmd1 and cmd2 are shell commands with their parameters.
I must behave exactly the same as this shell command
$> < file1 cmd1 | cmd2 > file2
Your program will be executed as follows:
$> ./pipex file1 cmd1 cmd2 cmd3 ... cmdn file2
Support « and » when the first parameter is "here_doc".
$> ./pipex here_doc LIMITER cmd cmd1 file
Should behave like:
cmd << LIMITER | cmd1 >> file
In mandatory part argc is 5. 5 minus pipex commnad minus infile minus outfile i have a two-commands pipe (5 - 3). In bonus part argc is n. I have a (n - 3)-commnds pipe.
For file1 check that there are reading permits For file2 check that there are writing permits
check that either cmd1 or cmd2 are found in any path.
I use the function arg_ok that verifies those four arguments, prints all error messages related to such verification. If all four arguments are correct, arg_ok returns a struct with the right values to pass to the children processes. The struct holds a all_ok variable that indicates if all filenames and commands are reachable. Also, max_cmds and num_cmds help to manage the cmds malloc
typedef struct s_pipex_args
{
int max_cmds;
int num_cmds;
char *infile;
char *outfile;
char **cmds;
int all_ok;
} t_pipex_args;
source, alias, bg, bind, break, builtin, caller, cd, command, compgen, complete, compopt, continue, declare, typeset, dirs, disown, echo, enable, eval, exec, exit, export, fc, fg, getopts, hash, help, history, jobs, kill, let, local, logout, mapfile, readarray, popd, printf, pushd, pwd, read, readonly, return, set, shift, shopt, suspend, test, times, trap, type, ulimit, umask, unalias, unset, wait,
MY linux box has 7717 commands
There are 18 allowed functions for Pipex. All of them belong to the set of 51 functions allowed in minishell. I summarize them here. It is work in advance to understand minishell.
Library | Function | Description |
---|---|---|
fcntl.h | open | The open() system call opens/creates the file specified by pathname in read/write mode if currente permission allow it. I used it with file1 and file2. O_CLOEXEC Enable the close‐on‐exec flag for the new file descriptor. It is essential in some multithreaded programs to avoid race conditions where one thread opens a file descriptor and attempts to set its close‐on‐exec flag using fcntl(2) at the same time as another thread does a fork(2) plus execve(2). Depending on the order of execution, the race may lead to the file descriptor returned by open() being unintentionally leaked to the program executed by the child process created by fork(2). |
stdio.h | perror | The perror() function produces a message on standard error describing the last error encountered during a call to a system or library function. The argument string s is printed, followed by a colon and a blank. Then an error message corresponding to the current value of errno and a new‐line. Joining "func" and "LINE" to string s i managed a more verbose error message. |
stdlib.h | exit | The exit() function causes normal process termination and the least significant byte of status (i.e., status & 0xFF) is returned to the parent. The C standard specifies two constants, EXIT_SUCCESS and EXIT_FAILURE, that may be passed to exit() to indicate successful or unsuccessful termination, respectively. All open streams are flushed and closed. Temporal files created are removed. |
stdlib.h | free | The free() function frees the memory space pointed to by ptr, which must have been returned by a previous call to malloc(). |
stdlib.h | malloc | The malloc() function allocates size bytes and returns a pointer to the allocated memory. The memory is not initialized. |
string.h | strerror | The strerror() function returns a pointer to a string that describes the error code passed in the argument errnum or "Unknown error nnn" if the error number is unknown. This string must not be modified by the application. |
sys/wait.h | wait | All of these system calls are used to wait for state changes in a child of the calling process, and obtain information about the child whose state has changed. A state change is considered to be:
In the case of a terminated child, performing a wait allows the system to release the resources associated with the child; if a wait is not performed, then the terminated child remains in a "zombie" state (see NOTES below). The wait() system call suspends execution of the calling thread until one of its children terminates. |
sys/wait.h | waitpid | The waitpid() system call suspends execution of the calling thread until a child specified by pid argument has changed state. By default, waitpid() waits only for terminated children, but this behavior is modifiable via the options argument, as described below. |
unistd.h | access | access() checks whether the calling process can access the file pathname. With mode F_OK cheks if file exists. On success, zero is returned. On error -1 is returned and errno indicates the error. |
unistd.h | close | close() closes a file descriptor, so that it no longer refers to any file and may be reused. Any record locks (see fcntl(2)) held on the file it was associated with, and owned by the process, are removed (regardless of the file descriptor that was used to obtain the lock). If fd is the last file descriptor referring to the underlying open file description (see open(2)), the resources associated with the open file description are freed; if the file descriptor was the last reference to a file which has been removed using unlink(2), the file is deleted. Returns zero on succsess. On error returns -1 and errno indicates the error. |
unistd.h | dup | The dup() system call allocates a new file descriptor(unused lowest-numbered) that refers to the same open file description as the descriptor oldfd.After a successful return, the old and new file descriptors may be used interchangeably. Since the two file descriptors refer to the same open file description, they share file offset and file status flags |
unistd.h | dup2 | The dup2() system call performs the same task as dup(), but instead of using the lowest‐numbered unused file descriptor, it uses the file descriptor number specified in newfd. In other words, the file descriptor newfd is adjusted so that it now refers to the same open file description as oldfd. If the file descriptor newfd was previously open, it is closed silently (no error reported) before being reused. on success returns the new file descriptor. On error returns -1 and errno indicates the error. |
unistd.h | execve | execve() executes the program referred to by pathname, that can be a binary executable or a script starting with a shebang line. This causes the program that is currently being run by the calling process to be replaced with a new program, with newly initialized stack, heap, and (initialized and uninitialized) data segments. execve() does not return on success, and the text, initialized data, uninitialized data (bss), and stack of the calling process are overwritten according to the contents of the newly loaded program. on error, -1 is returned and errno indicates the error. |
unistd.h | fork | fork() creates a new process by duplicating the calling process. The new process is referred to as the child process. The calling process is referred to as the parent process.The child inherits copies of the parent’s set of open file descriptors. Each file descriptor in the child refers to the same open file description (see open(2)) as the corresponding file descriptor in the parent. On success, the PID of the child process is returned in the parent, and 0 is returned in the child. On failure, -1 is returned in the parent, no child process is created, and errno is set to indicate the error. |
unistd.h | pipe | pipe() creates a pipe, a unidirectional data channel that can be used for interprocess communication. The array pipefd is used to return two file descriptors referring to the ends of the pipe. pipefd[0] refers to the read end of the pipe. pipefd[1] refers to the write end of the pipe. Data written to the write end of the pipe is buffered by the kernel until it is read from the read end of the pipe. |
unistd.h | read | read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.If count is zero, read() may detect error EISDIR to check that fd is a directory. on success, the number of bytes read is returned. |
unistd.h | unlink | unlink() deletes a name from the filesystem. If that name was the last link to a file and no processes have the file open, the file is deleted and the space it was using is made available for reuse. |
unistd.h | write | write() writes up to count bytes from the buffer starting at buf to the file referred to by the file descriptor fd. If count is zero and fd refers to a regular file, then write() may return a failure status EPIPE for a file descriptor conected to a pipe whose reading end is closed |
I found two methods for getting the running enviroment of my a program
In ISO C you can define main either to take no arguments, or to take two arguments that represent the command line arguments to the program, like this:
int main (int argc, char *argv[])
In Unix systems you can define main a third way, using three arguments:
int main (int argc, char *argv[], char *envp[])
The first two arguments are just the same. The third argument envp gives the program’s environment; it is the same as the value of environ.
See Environment Variables.
POSIX.1 does not allow this three-argument form, so to be portable it is best to write main to take two arguments, and use the value of environ.
#include <stdio.h>
int main(int argc, char **argv, char **envp)
{
int n;
n = 0;
while (*envp != NULL)
{
printf("%s\n", *envp);
envp++;
}
}
The variable environ points to an array of pointers to strings called the "environment". The last pointer in this array has the value NULL.
This array of strings is made available to the process by the execve(2) call when a new program is started.
When a child process is created via fork(2), it inherits a copy of its parent’s environment.
#include <stdio.h>
int main(int argc, char **argv)
{
extern char **environ;
int n;
n = 0;
while (*environ != NULL)
{
printf("%s\n", *environ);
environ++;
}
}
Loops over environ matching var name, returning a pointer to the found var, NJULL otherhwise.
char *find_variable(char **environ, char *var)
{
int len;
len = ft_strlen(var);
while (*environ != NULL)
{
result = ft_strnstr(*environ, var, len);
if (result)
return (*environ);
}
return (NULL);
}
Splits value of PATH variable. For each path, joins "/command" and checks if execution permit is possible.
Returns command full path or NULL if there are not execution permits.
char *arg_fin_com(char *var_val, char *com)
{
char **paths;
char *command;
char *result;
char *slash_command;
int len;
slash_command = ft_strjoin("/", com);
len = 0;
result = NULL;
paths = get_paths(var_val);
while (paths[len] != NULL)
{
command = ft_strjoin(paths[len], slash_command);
if (!result && !access(command, X_OK))
result = command;
len++;
}
return (result);
}
my first validation checks if the filename has legal characters for a filename...any character except '/'.
Also, users may refer to a filename with a relative path such as ../inc/filename inc/filename ./filename
Then, if the filename does not begin with '/' it is a filename relative to the PWD whose absolute path name starts with the PWD variable content. otherwise it is an absolute path. It the filename does not contains ´/´, i have to look for it in PATH
The call to execve() requires a null-terminated list of arguments
Said that, "grep -n printf" transforms easily into a list of strings args ={"grep", "-n", "printf", NULL} with the help of ft_plit and white space.
In the same way, "tr 'a' 'e'", also fits into execve() without problems.
It is not the case with, "tr 'a' ''", that substitutes a by ' if passed as {"tr", "'a'", "'", NULL} It is not the case with, "tr 'a' ' '", that do not execute if passed as {"tr", "'a'", "'", "'", NULL}
The usual leaks --atExit -- ./pipex ....
method to check leaks does not work when the program to check uses redirections.
I used Valgring instead with --trace-children=yes flag.
I discovered that the grep command had leaks.
Thank you to Unai, I will use --track-fds=yes flag with Valgrind in the future.
Abel taught me about grep -v to invert match and `norminette | greo -v OK || echo "Norma OK" '
the environment variable $SHLV tells how deep is the bash you are running.
git remote get-url --all origin
shows the cloud origin of my repository
env -i
ignores the environment
The -fno-omit-frame-pointer flag tells the GCC compiler not ommit the frame pointer (FP) during function call optimization.
Frame Pointer (FP):
The FP is a CPU register that typically stores the address of the current function's stack frame.
The stack frame is a temporary memory area allocated on the stack for each function call. It holds local variables, function arguments, and housekeeping information.
Modern compilers often perform optimizations to improve code efficiency.
One optimization involves omitting (not generating) the frame pointer instruction in some situations.
While omitting the FP can be beneficial, there are situations where it's crucial to have it:
-Debugging: Debuggers rely on the FP to walk the call stack, stepping through function calls and examining local variables. Without the FP, debugging becomes significantly more challenging or even impossible.
-Exception Handling: Some exception handling mechanisms (like catching signals) use the FP to unwind the call stack and restore the programm state after an exception. Omitting the FP can disrupt this process.
-Backtraces: Tools like AddressSanitizer (ASan) or profilers might rely on the FP to generate stack traces for error reporting or performance analysis.
You'd typically use -fno-omit-frame-pointer when:
-You need to debug your program effectively.
-You're using exception handling or profiling tools that rely on the FP.
-You're encountering issues with ASan or other tools requiring the FP for proper operation. (My problem was Loop Printing "AddressSanitizer:DEADLYSIGNAL":)